Dataset statistics
| Number of variables | 32 |
|---|---|
| Number of observations | 119390 |
| Missing cells | 129425 |
| Missing cells (%) | 3.4% |
| Duplicate rows | 31994 |
| Duplicate rows (%) | 26.8% |
| Total size in memory | 105.7 MiB |
| Average record size in memory | 928.7 B |
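The two percentages in this table follow directly from the raw counts. A minimal sketch of that arithmetic, assuming "Missing cells (%)" is taken over all cells (observations times variables) and "Duplicate rows (%)" over all observations; the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 32
n_observations = 119390
missing_cells = 129425
duplicate_rows = 31994

# Assumption: the missing-cell share is relative to every cell in the frame.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is relative to the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```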
Variable types
| NUM | 17 |
|---|---|
| CAT | 13 |
| BOOL | 2 |
Reproduction
| Analysis started | 2020-04-11 05:26:16.402833 |
|---|---|
| Analysis finished | 2020-04-11 05:29:44.481746 |
| Version | pandas-profiling v2.5.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| Dataset has 31994 (26.8%) duplicate rows | Duplicates |
| country has a high cardinality: 177 distinct values | High cardinality |
| reservation_status_date has a high cardinality: 926 distinct values | High cardinality |
| agent has 16340 (13.7%) missing values | Missing |
| company has 112593 (94.3%) missing values | Missing |
| babies is highly skewed (γ1 = 24.64654483) | Skewed |
| previous_cancellations is highly skewed (γ1 = 24.45804872) | Skewed |
| previous_bookings_not_canceled is highly skewed (γ1 = 23.53979995) | Skewed |
| reservation_status_date only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
| lead_time has 6345 (5.3%) zeros | Zeros |
| stays_in_weekend_nights has 51998 (43.6%) zeros | Zeros |
| stays_in_week_nights has 7645 (6.4%) zeros | Zeros |
| children has 110796 (92.8%) zeros | Zeros |
| babies has 118473 (99.2%) zeros | Zeros |
| previous_cancellations has 112906 (94.6%) zeros | Zeros |
| previous_bookings_not_canceled has 115770 (97.0%) zeros | Zeros |
| booking_changes has 101314 (84.9%) zeros | Zeros |
| days_in_waiting_list has 115692 (96.9%) zeros | Zeros |
| adr has 1959 (1.6%) zeros | Zeros |
| required_car_parking_spaces has 111974 (93.8%) zeros | Zeros |
| total_of_special_requests has 70318 (58.9%) zeros | Zeros |
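Several of these warnings correspond to short pandas checks. The sketch below uses a tiny made-up frame that only borrows column names from the report; the values are illustrative, not taken from the dataset, and this is not the profiler's own implementation.

```python
import pandas as pd

# Toy frame reusing column names from the report; all values are invented.
df = pd.DataFrame({
    "reservation_status_date": ["2015-07-01", "2015-07-01", "2016-02-29", "2017-09-14"],
    "company": [223.0, 223.0, None, None],
    "babies": [0, 0, 0, 1],
})

# Type warning: parse the string column into datetime values.
df["reservation_status_date"] = pd.to_datetime(df["reservation_status_date"])

# Duplicates warning: count rows that repeat an earlier row exactly.
n_duplicates = int(df.duplicated().sum())

# Missing warning: share of missing values per column.
missing_share = df.isna().mean()

# Zeros warning: share of exact zeros in a numeric column.
zero_share = float((df["babies"] == 0).mean())

print(df.dtypes)
print(n_duplicates, missing_share["company"], zero_share)
```

Whether duplicates should be dropped or the sparse columns kept depends on the analysis; the report only flags them.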
hotel
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| City Hotel | 79330 | 66.4% | |
| Resort Hotel | 40060 | 33.6% |
Length
| Max length | 12 |
|---|---|
| Mean length | 10.67107798 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 8 | 66.7% | |
| Uppercase_Letter | 3 | 25.0% | |
| Space_Separator | 1 | 8.3% |
| Value | Count | Frequency (%) | |
| Latin | 11 | 91.7% | |
| Common | 1 | 8.3% |
| Value | Count | Frequency (%) | |
| ASCII | 12 | 100.0% |
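The length and character tables above carry no captions, but their numbers are consistent with a count-weighted mean string length and with distinct characters grouped by Unicode general category. That reading is inferred from the figures, not from the profiler's documentation. A small sketch that reproduces them from the two hotel values:

```python
import unicodedata
from collections import Counter

# Value counts copied from the hotel frequency table.
counts = {"City Hotel": 79330, "Resort Hotel": 40060}

# Mean string length, weighted by how often each value occurs.
n = sum(counts.values())
mean_length = sum(len(value) * c for value, c in counts.items()) / n

# Distinct characters across all values, grouped by Unicode general category
# ("Ll" = Lowercase_Letter, "Lu" = Uppercase_Letter, "Zs" = Space_Separator).
distinct_chars = set("".join(counts))
by_category = Counter(unicodedata.category(ch) for ch in distinct_chars)

print(round(mean_length, 8))
print(dict(by_category))
```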
is_canceled
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| 0 | 75166 | 63.0% | |
| 1 | 44224 | 37.0% |
lead_time
Real number (ℝ≥0)
| Distinct count | 479 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 104.01141636652986 |
|---|---|
| Minimum | 0 |
| Maximum | 737 |
| Zeros | 6345 |
| Zeros (%) | 5.3% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 18 |
| median | 69 |
| Q3 | 160 |
| 95-th percentile | 320 |
| Maximum | 737 |
| Range | 737 |
| Interquartile range (IQR) | 142 |
Descriptive statistics
| Standard deviation | 106.863097 |
|---|---|
| Coefficient of variation (CV) | 1.027416997 |
| Kurtosis | 1.696448849 |
| Mean | 104.0114164 |
| Mean absolute deviation (MAD) | 84.67197528 |
| Skewness | 1.346549873 |
| Sum | 12417923 |
| Variance | 11419.72151 |
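Several of the figures reported for this variable are derived from one another. As a quick consistency check, the sketch below recomputes them from the reported values; the variable names are illustrative, and the observation count comes from the dataset statistics.

```python
# Figures copied from the tables above.
n_observations = 119390
total = 12417923          # Sum
mean = 104.0114164
std = 106.863097
q1, q3 = 18, 160
minimum, maximum = 0, 737

mean_from_sum = total / n_observations   # Mean = Sum / N
cv = std / mean                          # Coefficient of variation
variance = std ** 2                      # Variance = std squared
iqr = q3 - q1                            # Interquartile range
value_range = maximum - minimum          # Range

print(mean_from_sum, cv, variance, iqr, value_range)
```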
| Value | Count | Frequency (%) | |
| 0 | 6345 | 5.3% | |
| 1 | 3460 | 2.9% | |
| 2 | 2069 | 1.7% | |
| 3 | 1816 | 1.5% | |
| 4 | 1715 | 1.4% | |
| 5 | 1565 | 1.3% | |
| 6 | 1445 | 1.2% | |
| 7 | 1331 | 1.1% | |
| 8 | 1138 | 1.0% | |
| 12 | 1079 | 0.9% | |
| Other values (469) | 97427 | 81.6% |
| Value | Count | Frequency (%) | |
| 0 | 6345 | 5.3% | |
| 1 | 3460 | 2.9% | |
| 2 | 2069 | 1.7% | |
| 3 | 1816 | 1.5% | |
| 4 | 1715 | 1.4% |
| Value | Count | Frequency (%) | |
| 737 | 1 | < 0.1% | |
| 709 | 1 | < 0.1% | |
| 629 | 17 | < 0.1% | |
| 626 | 30 | < 0.1% | |
| 622 | 17 | < 0.1% |
arrival_date_year
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| 2016 | 56707 | 47.5% | |
| 2017 | 40687 | 34.1% | |
| 2015 | 21996 | 18.4% |
Length
| Max length | 4 |
|---|---|
| Mean length | 4 |
| Min length | 4 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 6 | 100.0% |
| Value | Count | Frequency (%) | |
| Common | 6 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 6 | 100.0% |
arrival_date_month
Categorical
| Distinct count | 12 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| August | 13877 | 11.6% | |
| July | 12661 | 10.6% | |
| May | 11791 | 9.9% | |
| October | 11160 | 9.3% | |
| April | 11089 | 9.3% | |
| June | 10939 | 9.2% | |
| September | 10508 | 8.8% | |
| March | 9794 | 8.2% | |
| February | 8068 | 6.8% | |
| November | 6794 | 5.7% | |
| Other values (2) | 12709 | 10.6% |
Length
| Max length | 9 |
|---|---|
| Mean length | 5.903182846 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 18 | 69.2% | |
| Uppercase_Letter | 8 | 30.8% |
| Value | Count | Frequency (%) | |
| Latin | 26 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 26 | 100.0% |
arrival_date_week_number
Real number (ℝ≥0)
| Distinct count | 53 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.16517296255968 |
|---|---|
| Minimum | 1 |
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 16 |
| median | 28 |
| Q3 | 38 |
| 95-th percentile | 49 |
| Maximum | 53 |
| Range | 52 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 13.60513836 |
|---|---|
| Coefficient of variation (CV) | 0.500830176 |
| Kurtosis | -0.9860771763 |
| Mean | 27.16517296 |
| Mean absolute deviation (MAD) | 11.54992462 |
| Skewness | -0.01001432604 |
| Sum | 3243250 |
| Variance | 185.0997897 |
| Value | Count | Frequency (%) | |
| 33 | 3580 | 3.0% | |
| 30 | 3087 | 2.6% | |
| 32 | 3045 | 2.6% | |
| 34 | 3040 | 2.5% | |
| 18 | 2926 | 2.5% | |
| 21 | 2854 | 2.4% | |
| 28 | 2853 | 2.4% | |
| 17 | 2805 | 2.3% | |
| 20 | 2785 | 2.3% | |
| 29 | 2763 | 2.3% | |
| Other values (43) | 89652 | 75.1% |
| Value | Count | Frequency (%) | |
| 1 | 1047 | 0.9% | |
| 2 | 1218 | 1.0% | |
| 3 | 1319 | 1.1% | |
| 4 | 1487 | 1.2% | |
| 5 | 1387 | 1.2% |
| Value | Count | Frequency (%) | |
| 53 | 1816 | 1.5% | |
| 52 | 1195 | 1.0% | |
| 51 | 933 | 0.8% | |
| 50 | 1505 | 1.3% | |
| 49 | 1782 | 1.5% |
arrival_date_day_of_month
Real number (ℝ≥0)
| Distinct count | 31 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.798241058715135 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 30 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.780829471 |
|---|---|
| Coefficient of variation (CV) | 0.5558105765 |
| Kurtosis | -1.187168319 |
| Mean | 15.79824106 |
| Mean absolute deviation (MAD) | 7.578562929 |
| Skewness | -0.002000453979 |
| Sum | 1886152 |
| Variance | 77.10296619 |
| Value | Count | Frequency (%) | |
| 17 | 4406 | 3.7% | |
| 5 | 4317 | 3.6% | |
| 15 | 4196 | 3.5% | |
| 25 | 4160 | 3.5% | |
| 26 | 4147 | 3.5% | |
| 9 | 4096 | 3.4% | |
| 12 | 4087 | 3.4% | |
| 16 | 4078 | 3.4% | |
| 2 | 4055 | 3.4% | |
| 19 | 4052 | 3.4% | |
| Other values (21) | 77796 | 65.2% |
| Value | Count | Frequency (%) | |
| 1 | 3626 | 3.0% | |
| 2 | 4055 | 3.4% | |
| 3 | 3855 | 3.2% | |
| 4 | 3763 | 3.2% | |
| 5 | 4317 | 3.6% |
| Value | Count | Frequency (%) | |
| 31 | 2208 | 1.8% | |
| 30 | 3853 | 3.2% | |
| 29 | 3580 | 3.0% | |
| 28 | 3946 | 3.3% | |
| 27 | 3802 | 3.2% |
stays_in_weekend_nights
Real number (ℝ≥0)
| Distinct count | 17 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9275986263506156 |
|---|---|
| Minimum | 0 |
| Maximum | 19 |
| Zeros | 51998 |
| Zeros (%) | 43.6% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 19 |
| Range | 19 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 0.9986134946 |
|---|---|
| Coefficient of variation (CV) | 1.076557755 |
| Kurtosis | 7.174066064 |
| Mean | 0.9275986264 |
| Mean absolute deviation (MAD) | 0.8079951985 |
| Skewness | 1.38004645 |
| Sum | 110746 |
| Variance | 0.9972289116 |
| Value | Count | Frequency (%) | |
| 0 | 51998 | 43.6% | |
| 2 | 33308 | 27.9% | |
| 1 | 30626 | 25.7% | |
| 4 | 1855 | 1.6% | |
| 3 | 1259 | 1.1% | |
| 6 | 153 | 0.1% | |
| 5 | 79 | 0.1% | |
| 8 | 60 | 0.1% | |
| 7 | 19 | < 0.1% | |
| 9 | 11 | < 0.1% | |
| Other values (7) | 22 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 51998 | 43.6% | |
| 1 | 30626 | 25.7% | |
| 2 | 33308 | 27.9% | |
| 3 | 1259 | 1.1% | |
| 4 | 1855 | 1.6% |
| Value | Count | Frequency (%) | |
| 19 | 1 | < 0.1% | |
| 18 | 1 | < 0.1% | |
| 16 | 3 | < 0.1% | |
| 14 | 2 | < 0.1% | |
| 13 | 3 | < 0.1% |
stays_in_week_nights
Real number (ℝ≥0)
| Distinct count | 35 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.500301532791691 |
|---|---|
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 7645 |
| Zeros (%) | 6.4% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.908285615 |
|---|---|
| Coefficient of variation (CV) | 0.7632221914 |
| Kurtosis | 24.28455482 |
| Mean | 2.500301533 |
| Mean absolute deviation (MAD) | 1.364286816 |
| Skewness | 2.862249242 |
| Sum | 298511 |
| Variance | 3.641553989 |
| Value | Count | Frequency (%) | |
| 2 | 33684 | 28.2% | |
| 1 | 30310 | 25.4% | |
| 3 | 22258 | 18.6% | |
| 5 | 11077 | 9.3% | |
| 4 | 9563 | 8.0% | |
| 0 | 7645 | 6.4% | |
| 6 | 1499 | 1.3% | |
| 10 | 1036 | 0.9% | |
| 7 | 1029 | 0.9% | |
| 8 | 656 | 0.5% | |
| Other values (25) | 633 | 0.5% |
| Value | Count | Frequency (%) | |
| 0 | 7645 | 6.4% | |
| 1 | 30310 | 25.4% | |
| 2 | 33684 | 28.2% | |
| 3 | 22258 | 18.6% | |
| 4 | 9563 | 8.0% |
| Value | Count | Frequency (%) | |
| 50 | 1 | < 0.1% | |
| 42 | 1 | < 0.1% | |
| 41 | 1 | < 0.1% | |
| 40 | 2 | < 0.1% | |
| 35 | 1 | < 0.1% |
adults
Real number (ℝ≥0)
| Distinct count | 14 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.8564033838679956 |
|---|---|
| Minimum | 0 |
| Maximum | 55 |
| Zeros | 403 |
| Zeros (%) | 0.3% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 55 |
| Range | 55 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5792609988 |
|---|---|
| Coefficient of variation (CV) | 0.3120340137 |
| Kurtosis | 1352.115116 |
| Mean | 1.856403384 |
| Mean absolute deviation (MAD) | 0.3428851878 |
| Skewness | 18.31780476 |
| Sum | 221636 |
| Variance | 0.3355433048 |
| Value | Count | Frequency (%) | |
| 2 | 89680 | 75.1% | |
| 1 | 23027 | 19.3% | |
| 3 | 6202 | 5.2% | |
| 0 | 403 | 0.3% | |
| 4 | 62 | 0.1% | |
| 26 | 5 | < 0.1% | |
| 27 | 2 | < 0.1% | |
| 20 | 2 | < 0.1% | |
| 5 | 2 | < 0.1% | |
| 55 | 1 | < 0.1% | |
| Other values (4) | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 403 | 0.3% | |
| 1 | 23027 | 19.3% | |
| 2 | 89680 | 75.1% | |
| 3 | 6202 | 5.2% | |
| 4 | 62 | 0.1% |
| Value | Count | Frequency (%) | |
| 55 | 1 | < 0.1% | |
| 50 | 1 | < 0.1% | |
| 40 | 1 | < 0.1% | |
| 27 | 2 | < 0.1% | |
| 26 | 5 | < 0.1% |
children
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 4 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.10388990333874994 |
|---|---|
| Minimum | 0.0 |
| Maximum | 10.0 |
| Zeros | 110796 |
| Zeros (%) | 92.8% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3985614448 |
|---|---|
| Coefficient of variation (CV) | 3.836382863 |
| Kurtosis | 18.67369236 |
| Mean | 0.1038899033 |
| Mean absolute deviation (MAD) | 0.192829741 |
| Skewness | 4.112589542 |
| Sum | 12403 |
| Variance | 0.1588512253 |
| Value | Count | Frequency (%) | |
| 0 | 110796 | 92.8% | |
| 1 | 4861 | 4.1% | |
| 2 | 3652 | 3.1% | |
| 3 | 76 | 0.1% | |
| 10 | 1 | < 0.1% | |
| (Missing) | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 110796 | 92.8% | |
| 1 | 4861 | 4.1% | |
| 2 | 3652 | 3.1% | |
| 3 | 76 | 0.1% | |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 10 | 1 | < 0.1% | |
| 3 | 76 | 0.1% | |
| 2 | 3652 | 3.1% | |
| 1 | 4861 | 4.1% | |
| 0 | 110796 | 92.8% |
babies
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.007948739425412514 |
|---|---|
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 118473 |
| Zeros (%) | 99.2% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.0974361913 |
|---|---|
| Coefficient of variation (CV) | 12.25806837 |
| Kurtosis | 1633.948235 |
| Mean | 0.007948739425 |
| Mean absolute deviation (MAD) | 0.01577537492 |
| Skewness | 24.64654483 |
| Sum | 949 |
| Variance | 0.009493811375 |
| Value | Count | Frequency (%) | |
| 0 | 118473 | 99.2% | |
| 1 | 900 | 0.8% | |
| 2 | 15 | < 0.1% | |
| 10 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 118473 | 99.2% | |
| 1 | 900 | 0.8% | |
| 2 | 15 | < 0.1% | |
| 9 | 1 | < 0.1% | |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 10 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% | |
| 2 | 15 | < 0.1% | |
| 1 | 900 | 0.8% | |
| 0 | 118473 | 99.2% |
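Because this column takes only five distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does so under two assumptions: the reported skewness is the bias-adjusted Fisher-Pearson coefficient, and the reported variance uses an n - 1 denominator. It also contrasts the mean absolute deviation with the median absolute deviation; with more than 99% zeros the latter is 0, so the reported MAD of about 0.0158 matches the former.

```python
from math import sqrt
from statistics import median

# Value counts copied from the frequency table (value -> count).
counts = {0: 118473, 1: 900, 2: 15, 9: 1, 10: 1}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

def central_moment(k):
    """Population central moment of order k, computed from the value counts."""
    return sum(c * (v - mean) ** k for v, c in counts.items()) / n

m2, m3 = central_moment(2), central_moment(3)

# Sample variance (n - 1 in the denominator).
variance = m2 * n / (n - 1)

# Adjusted Fisher-Pearson skewness: bias-corrected form of m3 / m2**1.5.
skewness = sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

# Mean absolute deviation around the mean.
mean_abs_dev = sum(c * abs(v - mean) for v, c in counts.items()) / n

# Median absolute deviation around the median.
values = [v for v, c in counts.items() for _ in range(c)]
med = median(values)
median_abs_dev = median([abs(v - med) for v in values])

print(n, total)
print(variance, skewness)
print(mean_abs_dev, median_abs_dev)
```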
meal
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| BB | 92310 | 77.3% | |
| HB | 14463 | 12.1% | |
| SC | 10650 | 8.9% | |
| Undefined | 1169 | 1.0% | |
| FB | 798 | 0.7% |
Length
| Max length | 9 |
|---|---|
| Mean length | 2.068540079 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 6 | 54.5% | |
| Lowercase_Letter | 5 | 45.5% |
| Value | Count | Frequency (%) | |
| Latin | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
country
Categorical
| Distinct count | 177 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 488 |
| Missing (%) | 0.4% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| PRT | 48590 | 40.7% | |
| GBR | 12129 | 10.2% | |
| FRA | 10415 | 8.7% | |
| ESP | 8568 | 7.2% | |
| DEU | 7287 | 6.1% | |
| ITA | 3766 | 3.2% | |
| IRL | 3375 | 2.8% | |
| BEL | 2342 | 2.0% | |
| BRA | 2224 | 1.9% | |
| NLD | 2104 | 1.8% | |
| Other values (167) | 18102 | 15.2% |
Length
| Max length | 3 |
|---|---|
| Mean length | 2.98928721 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 26 | 92.9% | |
| Lowercase_Letter | 2 | 7.1% |
| Value | Count | Frequency (%) | |
| Latin | 28 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 28 | 100.0% |
market_segment
Categorical
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| Online TA | 56477 | 47.3% | |
| Offline TA/TO | 24219 | 20.3% | |
| Groups | 19811 | 16.6% | |
| Direct | 12606 | 10.6% | |
| Corporate | 5295 | 4.4% | |
| Complementary | 743 | 0.6% | |
| Aviation | 237 | 0.2% | |
| Undefined | 2 | < 0.1% |
Length
| Max length | 13 |
|---|---|
| Mean length | 9.01976715 |
| Min length | 6 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 17 | 65.4% | |
| Uppercase_Letter | 7 | 26.9% | |
| Other_Punctuation | 1 | 3.8% | |
| Space_Separator | 1 | 3.8% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 92.3% | |
| Common | 2 | 7.7% |
| Value | Count | Frequency (%) | |
| ASCII | 26 | 100.0% |
distribution_channel
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| TA/TO | 97870 | 82.0% | |
| Direct | 14645 | 12.3% | |
| Corporate | 6677 | 5.6% | |
| GDS | 193 | 0.2% | |
| Undefined | 5 | < 0.1% |
Length
| Max length | 9 |
|---|---|
| Mean length | 5.343303459 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 11 | 55.0% | |
| Uppercase_Letter | 8 | 40.0% | |
| Other_Punctuation | 1 | 5.0% |
| Value | Count | Frequency (%) | |
| Latin | 19 | 95.0% | |
| Common | 1 | 5.0% |
| Value | Count | Frequency (%) | |
| ASCII | 20 | 100.0% |
is_repeated_guest
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| 0 | 115580 | 96.8% | |
| 1 | 3810 | 3.2% |
previous_cancellations
Real number (ℝ≥0)
| Distinct count | 15 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.08711784906608594 |
|---|---|
| Minimum | 0 |
| Maximum | 26 |
| Zeros | 112906 |
| Zeros (%) | 94.6% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 26 |
| Range | 26 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.8443363842 |
|---|---|
| Coefficient of variation (CV) | 9.691887405 |
| Kurtosis | 674.0736926 |
| Mean | 0.08711784907 |
| Mean absolute deviation (MAD) | 0.1647730608 |
| Skewness | 24.45804872 |
| Sum | 10401 |
| Variance | 0.7129039296 |
| Value | Count | Frequency (%) | |
| 0 | 112906 | 94.6% | |
| 1 | 6051 | 5.1% | |
| 2 | 116 | 0.1% | |
| 3 | 65 | 0.1% | |
| 24 | 48 | < 0.1% | |
| 11 | 35 | < 0.1% | |
| 4 | 31 | < 0.1% | |
| 26 | 26 | < 0.1% | |
| 25 | 25 | < 0.1% | |
| 6 | 22 | < 0.1% | |
| Other values (5) | 65 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 112906 | 94.6% | |
| 1 | 6051 | 5.1% | |
| 2 | 116 | 0.1% | |
| 3 | 65 | 0.1% | |
| 4 | 31 | < 0.1% |
| Value | Count | Frequency (%) | |
| 26 | 26 | < 0.1% | |
| 25 | 25 | < 0.1% | |
| 24 | 48 | < 0.1% | |
| 21 | 1 | < 0.1% | |
| 19 | 19 | < 0.1% |
previous_bookings_not_canceled
Real number (ℝ≥0)
| Distinct count | 73 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13709690928888515 |
|---|---|
| Minimum | 0 |
| Maximum | 72 |
| Zeros | 115770 |
| Zeros (%) | 97.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 72 |
| Range | 72 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.497436848 |
|---|---|
| Coefficient of variation (CV) | 10.92246977 |
| Kurtosis | 767.2452097 |
| Mean | 0.1370969093 |
| Mean absolute deviation (MAD) | 0.2658800434 |
| Skewness | 23.53979995 |
| Sum | 16368 |
| Variance | 2.242317113 |
| Value | Count | Frequency (%) | |
| 0 | 115770 | 97.0% | |
| 1 | 1542 | 1.3% | |
| 2 | 580 | 0.5% | |
| 3 | 333 | 0.3% | |
| 4 | 229 | 0.2% | |
| 5 | 181 | 0.2% | |
| 6 | 115 | 0.1% | |
| 7 | 88 | 0.1% | |
| 8 | 70 | 0.1% | |
| 9 | 60 | 0.1% | |
| Other values (63) | 422 | 0.4% |
| Value | Count | Frequency (%) | |
| 0 | 115770 | 97.0% | |
| 1 | 1542 | 1.3% | |
| 2 | 580 | 0.5% | |
| 3 | 333 | 0.3% | |
| 4 | 229 | 0.2% |
| Value | Count | Frequency (%) | |
| 72 | 1 | < 0.1% | |
| 71 | 1 | < 0.1% | |
| 70 | 1 | < 0.1% | |
| 69 | 1 | < 0.1% | |
| 68 | 1 | < 0.1% |
reserved_room_type
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| A | 85994 | 72.0% | |
| D | 19201 | 16.1% | |
| E | 6535 | 5.5% | |
| F | 2897 | 2.4% | |
| G | 2094 | 1.8% | |
| B | 1118 | 0.9% | |
| C | 932 | 0.8% | |
| H | 601 | 0.5% | |
| P | 12 | < 0.1% | |
| L | 6 | < 0.1% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 10 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 10 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 10 | 100.0% |
assigned_room_type
Categorical
| Distinct count | 12 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| A | 74053 | 62.0% | |
| D | 25322 | 21.2% | |
| E | 7806 | 6.5% | |
| F | 3751 | 3.1% | |
| G | 2553 | 2.1% | |
| C | 2375 | 2.0% | |
| B | 2163 | 1.8% | |
| H | 712 | 0.6% | |
| I | 363 | 0.3% | |
| K | 279 | 0.2% | |
| Other values (2) | 13 | < 0.1% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 12 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 12 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 12 | 100.0% |
booking_changes
Real number (ℝ≥0)
| Distinct count | 21 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.22112404724013737 |
|---|---|
| Minimum | 0 |
| Maximum | 21 |
| Zeros | 101314 |
| Zeros (%) | 84.9% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 21 |
| Range | 21 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.6523055727 |
|---|---|
| Coefficient of variation (CV) | 2.949953118 |
| Kurtosis | 79.39360467 |
| Mean | 0.2211240472 |
| Mean absolute deviation (MAD) | 0.3752904217 |
| Skewness | 6.000270054 |
| Sum | 26400 |
| Variance | 0.4255025601 |
| Value | Count | Frequency (%) | |
| 0 | 101314 | 84.9% | |
| 1 | 12701 | 10.6% | |
| 2 | 3805 | 3.2% | |
| 3 | 927 | 0.8% | |
| 4 | 376 | 0.3% | |
| 5 | 118 | 0.1% | |
| 6 | 63 | 0.1% | |
| 7 | 31 | < 0.1% | |
| 8 | 17 | < 0.1% | |
| 9 | 8 | < 0.1% | |
| Other values (11) | 30 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 101314 | 84.9% | |
| 1 | 12701 | 10.6% | |
| 2 | 3805 | 3.2% | |
| 3 | 927 | 0.8% | |
| 4 | 376 | 0.3% |
| Value | Count | Frequency (%) | |
| 21 | 1 | < 0.1% | |
| 20 | 1 | < 0.1% | |
| 18 | 1 | < 0.1% | |
| 17 | 2 | < 0.1% | |
| 16 | 2 | < 0.1% |
deposit_type
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| No Deposit | 104641 | 87.6% | |
| Non Refund | 14587 | 12.2% | |
| Refundable | 162 | 0.1% |
Length
| Max length | 10 |
|---|---|
| Mean length | 10 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 13 | 76.5% | |
| Uppercase_Letter | 3 | 17.6% | |
| Space_Separator | 1 | 5.9% |
| Value | Count | Frequency (%) | |
| Latin | 16 | 94.1% | |
| Common | 1 | 5.9% |
| Value | Count | Frequency (%) | |
| ASCII | 17 | 100.0% |
agent
Real number (ℝ≥0)
| Distinct count | 333 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 16340 |
| Missing (%) | 13.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 86.69338185346919 |
|---|---|
| Minimum | 1.0 |
| Maximum | 535.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 9 |
| median | 14 |
| Q3 | 229 |
| 95-th percentile | 250 |
| Maximum | 535 |
| Range | 534 |
| Interquartile range (IQR) | 220 |
Descriptive statistics
| Standard deviation | 110.7745476 |
|---|---|
| Coefficient of variation (CV) | 1.277773981 |
| Kurtosis | -0.007179564938 |
| Mean | 86.69338185 |
| Mean absolute deviation (MAD) | 97.04657859 |
| Skewness | 1.089385636 |
| Sum | 8933753 |
| Variance | 12271.00041 |
| Value | Count | Frequency (%) | |
| 9 | 31961 | 26.8% | |
| 240 | 13922 | 11.7% | |
| 1 | 7191 | 6.0% | |
| 14 | 3640 | 3.0% | |
| 7 | 3539 | 3.0% | |
| 6 | 3290 | 2.8% | |
| 250 | 2870 | 2.4% | |
| 241 | 1721 | 1.4% | |
| 28 | 1666 | 1.4% | |
| 8 | 1514 | 1.3% | |
| Other values (323) | 31736 | 26.6% | |
| (Missing) | 16340 | 13.7% |
| Value | Count | Frequency (%) | |
| 1 | 7191 | 6.0% | |
| 2 | 162 | 0.1% | |
| 3 | 1336 | 1.1% | |
| 4 | 47 | < 0.1% | |
| 5 | 330 | 0.3% |
| Value | Count | Frequency (%) | |
| 535 | 3 | < 0.1% | |
| 531 | 68 | 0.1% | |
| 527 | 35 | < 0.1% | |
| 526 | 10 | < 0.1% | |
| 510 | 2 | < 0.1% |
company
Real number (ℝ≥0)
| Distinct count | 352 |
|---|---|
| Unique (%) | 5.2% |
| Missing | 112593 |
| Missing (%) | 94.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 189.26673532440782 |
|---|---|
| Minimum | 6.0 |
| Maximum | 543.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 62 |
| median | 179 |
| Q3 | 270 |
| 95-th percentile | 435 |
| Maximum | 543 |
| Range | 537 |
| Interquartile range (IQR) | 208 |
Descriptive statistics
| Standard deviation | 131.6550146 |
|---|---|
| Coefficient of variation (CV) | 0.6956056721 |
| Kurtosis | -0.4907952103 |
| Mean | 189.2667353 |
| Mean absolute deviation (MAD) | 109.1110502 |
| Skewness | 0.6015996673 |
| Sum | 1286446 |
| Variance | 17333.04288 |
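With 94.3% of this column missing, it matters which denominator each statistic uses. The figures above are consistent with "Unique (%)", the mean and the sum all being computed over non-missing values only. This is an inference from the numbers rather than documented behavior, and the sketch below simply checks it:

```python
# Figures copied from the tables above and from the dataset statistics.
n_rows = 119390
missing = 112593
distinct = 352
reported_sum = 1286446
reported_mean = 189.26673532440782

non_missing = n_rows - missing

# "Unique (%)" under two candidate denominators.
unique_pct_non_missing = 100 * distinct / non_missing
unique_pct_all_rows = 100 * distinct / n_rows

# Sum / Mean recovers the number of values that entered the mean.
implied_count = reported_sum / reported_mean

print(non_missing, round(unique_pct_non_missing, 1), round(unique_pct_all_rows, 1))
print(round(implied_count))
```

Only the non-missing denominator reproduces the reported 5.2%.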
| Value | Count | Frequency (%) | |
| 40 | 927 | 0.8% | |
| 223 | 784 | 0.7% | |
| 67 | 267 | 0.2% | |
| 45 | 250 | 0.2% | |
| 153 | 215 | 0.2% | |
| 174 | 149 | 0.1% | |
| 219 | 141 | 0.1% | |
| 281 | 138 | 0.1% | |
| 154 | 133 | 0.1% | |
| 405 | 119 | 0.1% | |
| Other values (342) | 3674 | 3.1% | |
| (Missing) | 112593 | 94.3% |
| Value | Count | Frequency (%) | |
| 6 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 9 | 37 | < 0.1% | |
| 10 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 543 | 2 | < 0.1% | |
| 541 | 1 | < 0.1% | |
| 539 | 2 | < 0.1% | |
| 534 | 2 | < 0.1% | |
| 531 | 1 | < 0.1% |
days_in_waiting_list
Real number (ℝ≥0)
| Distinct count | 128 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.321149174972778 |
|---|---|
| Minimum | 0 |
| Maximum | 391 |
| Zeros | 115692 |
| Zeros (%) | 96.9% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 391 |
| Range | 391 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 17.59472088 |
|---|---|
| Coefficient of variation (CV) | 7.580176694 |
| Kurtosis | 186.7930696 |
| Mean | 2.321149175 |
| Mean absolute deviation (MAD) | 4.49879973 |
| Skewness | 11.94435345 |
| Sum | 277122 |
| Variance | 309.5742028 |
| Value | Count | Frequency (%) | |
| 0 | 115692 | 96.9% | |
| 39 | 227 | 0.2% | |
| 58 | 164 | 0.1% | |
| 44 | 141 | 0.1% | |
| 31 | 127 | 0.1% | |
| 35 | 96 | 0.1% | |
| 46 | 94 | 0.1% | |
| 69 | 89 | 0.1% | |
| 63 | 83 | 0.1% | |
| 50 | 80 | 0.1% | |
| Other values (118) | 2597 | 2.2% |
| Value | Count | Frequency (%) | |
| 0 | 115692 | 96.9% | |
| 1 | 12 | < 0.1% | |
| 2 | 5 | < 0.1% | |
| 3 | 59 | < 0.1% | |
| 4 | 25 | < 0.1% |
| Value | Count | Frequency (%) | |
| 391 | 45 | < 0.1% | |
| 379 | 15 | < 0.1% | |
| 330 | 15 | < 0.1% | |
| 259 | 10 | < 0.1% | |
| 236 | 35 | < 0.1% |
customer_type
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
| Transient | 89613 | 75.1% | |
| Transient-Party | 25124 | 21.0% | |
| Contract | 4076 | 3.4% | |
| Group | 577 | 0.5% |
Length
| Max length | 15 |
|---|---|
| Mean length | 10.20914649 |
| Min length | 5 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 12 | 70.6% | |
| Uppercase_Letter | 4 | 23.5% | |
| Dash_Punctuation | 1 | 5.9% |
| Value | Count | Frequency (%) | |
| Latin | 16 | 94.1% | |
| Common | 1 | 5.9% |
| Value | Count | Frequency (%) | |
| ASCII | 17 | 100.0% |
adr
Real number (ℝ)
| Distinct count | 8879 |
|---|---|
| Unique (%) | 7.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101.83112153446686 |
|---|---|
| Minimum | -6.38 |
| Maximum | 5400.0 |
| Zeros | 1959 |
| Zeros (%) | 1.6% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | -6.38 |
|---|---|
| 5-th percentile | 38.4 |
| Q1 | 69.29 |
| median | 94.575 |
| Q3 | 126 |
| 95-th percentile | 193.5 |
| Maximum | 5400 |
| Range | 5406.38 |
| Interquartile range (IQR) | 56.71 |
Descriptive statistics
| Standard deviation | 50.53579029 |
|---|---|
| Coefficient of variation (CV) | 0.4962705853 |
| Kurtosis | 1013.189851 |
| Mean | 101.8311215 |
| Mean absolute deviation (MAD) | 36.38052772 |
| Skewness | 10.53021398 |
| Sum | 12157617.6 |
| Variance | 2553.8661 |
| Value | Count | Frequency (%) | |
| 62 | 3754 | 3.1% | |
| 75 | 2715 | 2.3% | |
| 90 | 2473 | 2.1% | |
| 65 | 2418 | 2.0% | |
| 0 | 1959 | 1.6% | |
| 80 | 1889 | 1.6% | |
| 95 | 1661 | 1.4% | |
| 120 | 1607 | 1.3% | |
| 100 | 1573 | 1.3% | |
| 85 | 1538 | 1.3% | |
| Other values (8869) | 97803 | 81.9% |
| Value | Count | Frequency (%) | |
| -6.38 | 1 | < 0.1% | |
| 0 | 1959 | 1.6% | |
| 0.26 | 1 | < 0.1% | |
| 0.5 | 1 | < 0.1% | |
| 1 | 15 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5400 | 1 | < 0.1% | |
| 510 | 1 | < 0.1% | |
| 508 | 1 | < 0.1% | |
| 451.5 | 1 | < 0.1% | |
| 450 | 1 | < 0.1% |
required_car_parking_spaces
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.06251779881062065 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 111974 |
| Zeros (%) | 93.8% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.2452911475 |
|---|---|
| Coefficient of variation (CV) | 3.92354101 |
| Kurtosis | 29.99805617 |
| Mean | 0.06251779881 |
| Mean absolute deviation (MAD) | 0.1172689171 |
| Skewness | 4.163233238 |
| Sum | 7464 |
| Variance | 0.06016774703 |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 111974 | 93.8% | |
| 1 | 7383 | 6.2% | |
| 2 | 28 | < 0.1% | |
| 3 | 3 | < 0.1% | |
| 8 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 111974 | 93.8% | |
| 1 | 7383 | 6.2% | |
| 2 | 28 | < 0.1% | |
| 3 | 3 | < 0.1% | |
| 8 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 2 | < 0.1% | |
| 3 | 3 | < 0.1% | |
| 2 | 28 | < 0.1% | |
| 1 | 7383 | 6.2% | |
| 0 | 111974 | 93.8% |
total_of_special_requests
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5713627607002262 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 70318 |
| Zeros (%) | 58.9% |
| Memory size | 932.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7927984228 |
|---|---|
| Coefficient of variation (CV) | 1.387557043 |
| Kurtosis | 1.492564811 |
| Mean | 0.5713627607 |
| Median Absolute Deviation (MAD) | 0.6730393937 |
| Skewness | 1.349189377 |
| Sum | 68215 |
| Variance | 0.6285293392 |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 70318 | 58.9% | |
| 1 | 33226 | 27.8% | |
| 2 | 12969 | 10.9% | |
| 3 | 2497 | 2.1% | |
| 4 | 340 | 0.3% | |
| 5 | 40 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 70318 | 58.9% | |
| 1 | 33226 | 27.8% | |
| 2 | 12969 | 10.9% | |
| 3 | 2497 | 2.1% | |
| 4 | 340 | 0.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 40 | < 0.1% | |
| 4 | 340 | 0.3% | |
| 3 | 2497 | 2.1% | |
| 2 | 12969 | 10.9% | |
| 1 | 33226 | 27.8% |
reservation_status
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Check-Out | 75166 | 63.0% | |
| Canceled | 43017 | 36.0% | |
| No-Show | 1207 | 1.0% |
Length
| Max length | 9 |
|---|---|
| Mean length | 8.619473993 |
| Min length | 7 |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Lowercase_Letter | 12 | 70.6% | |
| Uppercase_Letter | 4 | 23.5% | |
| Dash_Punctuation | 1 | 5.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Latin | 16 | 94.1% | |
| Common | 1 | 5.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 17 | 100.0% |
reservation_status_date
Categorical
| Distinct count | 926 |
|---|---|
| Unique (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 932.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2015-10-21 | 1461 | 1.2% | |
| 2015-07-06 | 805 | 0.7% | |
| 2016-11-25 | 790 | 0.7% | |
| 2015-01-01 | 763 | 0.6% | |
| 2016-01-18 | 625 | 0.5% | |
| 2015-07-02 | 469 | 0.4% | |
| 2016-12-07 | 450 | 0.4% | |
| 2015-12-18 | 423 | 0.4% | |
| 2016-02-09 | 412 | 0.3% | |
| 2016-04-04 | 382 | 0.3% | |
| Other values (916) | 112810 | 94.5% |
Length
| Max length | 10 |
|---|---|
| Mean length | 10 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Decimal_Number | 10 | 90.9% | |
| Dash_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 11 | 100.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
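The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report uses; the function name `pearson_r` and the sample values are assumptions made for the example and are not taken from the dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of centered products are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only (hypothetical, not from the report).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)

# Shifting and positively rescaling either variable should leave r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```

Comparing `r` with `r_rescaled` shows the location and scale invariance mentioned above; negating one variable would flip the sign of r.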
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
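As a rough sketch of that definition, ρ can be computed by ranking each variable and applying Pearson's formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, ties are assumed to receive the mean of their rank positions, and the sample values are made up for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (illustrative values only).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

For this cubic relationship `rho` reaches the maximum while `r` stays below it, which is the sense in which ρ captures nonlinear monotonic association.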
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
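The pair-counting definition above corresponds to the variant usually called τ-a, and a minimal sketch of it follows. The function name `kendall_tau_a` and the sample values are assumptions for illustration. Statistical libraries often report a tie-adjusted variant instead, so results may differ on data with many tied values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y order the two observations alike.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count toward neither total.
    return (concordant - discordant) / len(pairs)


# Illustrative values only: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
```

This brute-force version visits every pair, so its cost grows quadratically with the number of observations; it is meant to mirror the definition rather than to be efficient.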
First rows
| | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Resort Hotel | 0 | 342 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | BB | PRT | Direct | Direct | 0 | 0 | 0 | C | C | 3 | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 1 | Resort Hotel | 0 | 737 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | BB | PRT | Direct | Direct | 0 | 0 | 0 | C | C | 4 | No Deposit | NaN | NaN | 0 | Transient | 0.0 | 0 | 0 | Check-Out | 2015-07-01 |
| 2 | Resort Hotel | 0 | 7 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | BB | GBR | Direct | Direct | 0 | 0 | 0 | A | C | 0 | No Deposit | NaN | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 3 | Resort Hotel | 0 | 13 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | BB | GBR | Corporate | Corporate | 0 | 0 | 0 | A | A | 0 | No Deposit | 304.0 | NaN | 0 | Transient | 75.0 | 0 | 0 | Check-Out | 2015-07-02 |
| 4 | Resort Hotel | 0 | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | BB | GBR | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 240.0 | NaN | 0 | Transient | 98.0 | 0 | 1 | Check-Out | 2015-07-03 |
| 5 | Resort Hotel | 0 | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | BB | GBR | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 240.0 | NaN | 0 | Transient | 98.0 | 0 | 1 | Check-Out | 2015-07-03 |
| 6 | Resort Hotel | 0 | 0 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Direct | Direct | 0 | 0 | 0 | C | C | 0 | No Deposit | NaN | NaN | 0 | Transient | 107.0 | 0 | 0 | Check-Out | 2015-07-03 |
| 7 | Resort Hotel | 0 | 9 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | FB | PRT | Direct | Direct | 0 | 0 | 0 | C | C | 0 | No Deposit | 303.0 | NaN | 0 | Transient | 103.0 | 0 | 1 | Check-Out | 2015-07-03 |
| 8 | Resort Hotel | 1 | 85 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 240.0 | NaN | 0 | Transient | 82.0 | 0 | 1 | Canceled | 2015-05-06 |
| 9 | Resort Hotel | 1 | 75 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | HB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 15.0 | NaN | 0 | Transient | 105.5 | 0 | 0 | Canceled | 2015-04-22 |
Last rows
| | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 119380 | City Hotel | 0 | 44 | 2017 | August | 35 | 31 | 1 | 3 | 2 | 0.0 | 0 | SC | DEU | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 140.75 | 0 | 1 | Check-Out | 2017-09-04 |
| 119381 | City Hotel | 0 | 188 | 2017 | August | 35 | 31 | 2 | 3 | 2 | 0.0 | 0 | BB | DEU | Direct | Direct | 0 | 0 | 0 | A | A | 0 | No Deposit | 14.0 | NaN | 0 | Transient | 99.00 | 0 | 0 | Check-Out | 2017-09-05 |
| 119382 | City Hotel | 0 | 135 | 2017 | August | 35 | 30 | 2 | 4 | 3 | 0.0 | 0 | BB | JPN | Online TA | TA/TO | 0 | 0 | 0 | G | G | 0 | No Deposit | 7.0 | NaN | 0 | Transient | 209.00 | 0 | 0 | Check-Out | 2017-09-05 |
| 119383 | City Hotel | 0 | 164 | 2017 | August | 35 | 31 | 2 | 4 | 2 | 0.0 | 0 | BB | DEU | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 42.0 | NaN | 0 | Transient | 87.60 | 0 | 0 | Check-Out | 2017-09-06 |
| 119384 | City Hotel | 0 | 21 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | BB | BEL | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 394.0 | NaN | 0 | Transient | 96.14 | 0 | 2 | Check-Out | 2017-09-06 |
| 119385 | City Hotel | 0 | 23 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | BB | BEL | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 394.0 | NaN | 0 | Transient | 96.14 | 0 | 0 | Check-Out | 2017-09-06 |
| 119386 | City Hotel | 0 | 102 | 2017 | August | 35 | 31 | 2 | 5 | 3 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | E | E | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 225.43 | 0 | 2 | Check-Out | 2017-09-07 |
| 119387 | City Hotel | 0 | 34 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | BB | DEU | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 157.71 | 0 | 4 | Check-Out | 2017-09-07 |
| 119388 | City Hotel | 0 | 109 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | BB | GBR | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 89.0 | NaN | 0 | Transient | 104.40 | 0 | 0 | Check-Out | 2017-09-07 |
| 119389 | City Hotel | 0 | 205 | 2017 | August | 35 | 29 | 2 | 7 | 2 | 0.0 | 0 | HB | DEU | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 151.20 | 0 | 2 | Check-Out | 2017-09-07 |